Extracting Social Networks and Biographical Facts From Conversational Speech Transcripts

Authors

  • Hongyan Jing
  • Nanda Kambhatla
  • Salim Roukos
Abstract

We present a general framework for automatically extracting social networks and biographical facts from conversational speech. Our approach fuses the output of multiple information extraction modules, including entity recognition and detection, relation detection, and event detection modules. We describe the specific features and algorithmic refinements that are effective for conversational speech. Cumulatively, these increase the performance of social network extraction from 0.06 to 0.30 on the development set and from 0.06 to 0.28 on the test set, as measured by F-measure on the ties within a network. The same framework can be applied to other genres of text: we have built an automatic biography generation system for general domain text using the same approach.
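As a rough illustration of the evaluation metric named in the abstract, the sketch below computes an F-measure over the ties of a network. The abstract does not say how ties are represented or matched, so everything here is an assumption: ties are treated as unordered pairs of person names, a predicted tie counts as correct only on an exact match with a gold tie, and the balanced F1 form is used. The names `to_ties` and `tie_f_measure` are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Assumption: a tie is an unordered pair of person names.
Tie = FrozenSet[str]


def to_ties(pairs: Iterable[Tuple[str, str]]) -> Set[Tie]:
    """Normalize (a, b) pairs into unordered ties, dropping self-loops."""
    return {frozenset((a, b)) for a, b in pairs if a != b}


def tie_f_measure(
    predicted: Iterable[Tuple[str, str]],
    gold: Iterable[Tuple[str, str]],
) -> float:
    """Balanced F-measure (F1) of predicted ties against gold ties."""
    pred = to_ties(predicted)
    ref = to_ties(gold)
    correct = len(pred & ref)  # ties present in both networks
    if correct == 0:
        # Covers empty inputs and the case of no overlap.
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy networks, not data from the paper.
    predicted = [("Ann", "Bob"), ("Bob", "Cat"), ("Ann", "Dan")]
    gold = [("Bob", "Ann"), ("Cat", "Dan")]
    print(round(tie_f_measure(predicted, gold), 2))
```

In this toy case one of three predicted ties is correct (precision 1/3) and one of two gold ties is recovered (recall 1/2). A system that scores ties with weights or partial credit would need a different matching step.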

Similar resources

Ethnomethodology and Conversational Analysis

In a speech community, people utilize their communicative competence which they have acquired from their society as part of their distinctive sociolinguistic identity. They negotiate and share meanings, because they have commonsense knowledge about the world, and have universal practical reasoning. Their commonsense knowledge is embodied in their language. Thus, not only does social life depend...

FROntIER: A Framework for Extracting and Organizing Biographical Facts in Historical Documents

The tasks of entity recognition through ontological commitment, fact extraction and organization in conformance to a target schema, and entity deduplication have all been examined in recent years, and systems exist that can perform each individual task. A framework combining all these tasks, however, is still needed to accomplish the goal of automatically extracting and organizing biographical ...

Lesion correlates of conversational speech production deficits.

We assess brain areas involved in speech production using a recently developed lesion-symptom mapping method (voxel-based lesion-symptom mapping, VLSM) with 50 aphasic patients with left-hemisphere lesions. Conversational speech was collected through a standardized biographical interview, and used to determine mean length of utterance in morphemes (MLU), type token ratio (TTR) and overall token...

Acoustic Model Training with Detecting Transcription Errors in the Training Data

As the target of Automatic Speech Recognition (ASR) has moved from clean read speech to spontaneous conversational speech, we need to prepare orthographic transcripts of spontaneous conversational speech to train acoustic models (AMs). However, it is expensive and slow to manually transcribe such speech word by word. We propose a framework to train an AM based on easy-to-make rough transcripts ...

Intention Extraction from Text Messages

Identifying users' intentions plays a crucial role in providing better user services, such as web search and automated message handling. There is a significant literature on extracting speakers’ intentions and speech acts from spoken words, and this paper proposes a novel approach to extracting intentions from non-spoken words, such as web-search query texts and text messages. Unlike spoken ...

Publication date: 2007